Family-Joining: A Fast Distance-Based Method for Constructing Generally Labeled Trees

Authors

  • Prabhav Kalaghatgi
  • Nico Pfeifer
  • Thomas Lengauer
Abstract

The widely used model for evolutionary relationships is a bifurcating tree with all taxa/observations placed at the leaves. This is not appropriate if the taxa have been densely sampled across evolutionary time and may be in a direct ancestral relationship, or if there is not enough information to fully resolve all the branching points in the evolutionary tree. In this article, we present a fast distance-based agglomeration method called family-joining (FJ) for constructing so-called generally labeled trees, in which taxa may be placed at internal vertices and the tree may contain polytomies. FJ constructs such trees on the basis of pairwise distances and a distance threshold. We tested three methods for threshold selection, FJ-AIC, FJ-BIC, and FJ-CV, which minimize the Akaike information criterion, the Bayesian information criterion, and the cross-validation error, respectively. When compared with related methods on simulated data, FJ-BIC was among the best at reconstructing the correct tree across a wide range of simulation scenarios. FJ-BIC was applied to HIV sequences sampled from individuals involved in a known transmission chain. The FJ-BIC tree was found to be compatible with almost all transmission events. On average, internal branches in the FJ-BIC tree have higher bootstrap support than branches in the leaf-labeled bifurcating tree constructed using RAxML: 36% of the internal branches in the FJ-BIC tree and 25% of those in the RAxML tree have bootstrap support greater than 70%. To the best of our knowledge, the method presented here is the first attempt at modeling evolutionary relationships using generally labeled trees.
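The abstract says the distance threshold is chosen by minimizing an information criterion. As a rough illustration of that selection step only, the sketch below scores a handful of candidate thresholds with the standard definitions AIC = 2k - 2 ln L and BIC = k ln n - 2 ln L and returns the minimizer. The `Candidate` record, the `select_threshold` function, and the numbers are hypothetical. How FJ computes the likelihood, and what it counts as the number of parameters k and the sample size n, is not stated in the abstract and is assumed here to be supplied by the caller.

```python
import math
from typing import NamedTuple, Sequence


class Candidate(NamedTuple):
    """One fitted tree for a given threshold (hypothetical record)."""
    threshold: float       # candidate distance threshold
    log_likelihood: float  # maximized log-likelihood of the tree (assumed given)
    n_params: int          # number of free parameters k (assumed given)


def aic(c: Candidate) -> float:
    # Standard Akaike information criterion: 2k - 2 ln L
    return 2 * c.n_params - 2 * c.log_likelihood


def bic(c: Candidate, n_obs: int) -> float:
    # Standard Bayesian information criterion: k ln n - 2 ln L
    return c.n_params * math.log(n_obs) - 2 * c.log_likelihood


def select_threshold(cands: Sequence[Candidate], n_obs: int,
                     criterion: str = "bic") -> float:
    """Return the threshold whose fitted tree minimizes the chosen criterion."""
    if criterion == "aic":
        best = min(cands, key=aic)
    elif criterion == "bic":
        best = min(cands, key=lambda c: bic(c, n_obs))
    else:
        raise ValueError(f"unknown criterion: {criterion}")
    return best.threshold


# Made-up candidates: larger thresholds collapse more branches (fewer parameters)
# at the cost of a lower likelihood.
candidates = [
    Candidate(threshold=0.00, log_likelihood=-100.0, n_params=10),
    Candidate(threshold=0.01, log_likelihood=-103.0, n_params=8),
    Candidate(threshold=0.05, log_likelihood=-120.0, n_params=3),
]
aic_choice = select_threshold(candidates, n_obs=50, criterion="aic")
bic_choice = select_threshold(candidates, n_obs=50, criterion="bic")
```

With these made-up numbers the two criteria disagree: BIC penalizes each parameter by ln 50 rather than 2, so it favors the slightly less resolved tree.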


Similar resources


Fine Grain Parallel Construction of Neighbour-joining Phylogenetic Trees with Reduced Redundancy Using Multithreading

In biological research, scientists often need to use information about species to infer the evolutionary relationships among them. The evolutionary relationships are generally represented by a labeled binary tree, called the evolutionary tree (or phylogenetic tree). The phylogeny problem is computationally intensive, and thus it is suitable for a parallel computing environment. In this paper,...


Why Neighbor-Joining Works

We show that the neighbor-joining algorithm is a robust quartet method for constructing trees from distances. This leads to a new performance guarantee that contains Atteson’s optimal radius bound as a special case and explains many cases where neighbor-joining is successful even when Atteson’s criterion is not satisfied. We also provide a proof for Atteson’s conjecture on the optimal edge radi...


Heuristics for Speeding Up Neighbour Joining

The neighbour joining method is a distance-based method for constructing evolutionary trees [2]. Conceptually, it starts out with a star-formed graph where each leaf corresponds to a species, and iteratively picks two nodes adjacent to the root and joins them by inserting a new node between the root and the two selected nodes. When joining nodes, the method selects the node pair i, j that mini...


Quicktree - SD Stephan

Phylogenetic methods are becoming part of the standard methodologies used in the analyses of biological sequence data, but several of the classical tools of phylogenetic inference were not designed with high throughput applications in mind. During our surveys, we haven't found software for the fast inference of neighbor joining trees from sequence alignments which could use biologically realist...


Journal title:

Volume 33, Issue

Pages -

Publication date: 2016